SeSAMe divides sample quality metrics into multiple groups, each generated by a dedicated function. All of these functions can be accessed through sesameQC_calcStats(), the main entry point for calculating quality metrics. It takes a SigDF, calculates the QC statistics, and returns a single object of class sesameQC, which can be printed directly to the console.
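The following is a minimal sketch of a call that yields a printout of the kind shown next. It assumes the sesame and sesameData packages are installed and that the example dataset `EPIC.5.SigDF.normal` is available in your sesameData cache; the dataset choice is an assumption, not necessarily the one used for the output in this document.

```r
library(sesame)

# Assumption: this sesameData example (a list of SigDF objects) is cached locally.
sdfs <- sesameDataGet("EPIC.5.SigDF.normal")
sdf <- sdfs[[1]]

# Calculate two groups of QC statistics for one sample.
# The second argument selects the metric groups by key.
qc <- sesameQC_calcStats(sdf, c("dyeBias", "detection"))

# Printing the sesameQC object lists each statistic with its short name.
qc
```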
##
## =====================
## | Dye Bias
## =====================
## Median Inf.I Intens. Red : 9150.00 (medR)
## Median Inf.I Intens. Grn : 6031.50 (medG)
## Median of Top 20 Inf.I Intens. Red : 35521.00 (topR)
## Median of Top 20 Inf.I Intens. Grn : 16026.50 (topG)
## Ratio of Red-to-Grn median Intens. : 1.52 (RGratio)
## Ratio of Top vs. Global R/G Ratios : 1.46 (RGdistort)
##
## =====================
## | Detection
## =====================
## N. Probes w/ Missing Raw Intensity : 0 (num_dtna)
## % Probes w/ Missing Raw Intensity : 0.0 % (frac_dtna)
## N. Probes w/ Detection Success : 828589 (num_dt)
## % Detection Success : 95.6 % (frac_dt)
## N. Detection Succ. (after masking) : 828589 (num_dt_mk)
## % Detection Succ. (after masking) : 95.6 % (frac_dt_mk)
## N. Probes w/ Detection Success (cg) : 825833 (num_dt_cg)
## % Detection Success (cg) : 95.7 % (frac_dt_cg)
## N. Probes w/ Detection Success (ch) : 2500 (num_dt_ch)
## % Detection Success (ch) : 85.3 % (frac_dt_ch)
## N. Probes w/ Detection Success (rs) : 57 (num_dt_rs)
## % Detection Success (rs) : 96.6 % (frac_dt_rs)
sesameQC_calcStats() returns an S4 sesameQC object. Which QC metrics are calculated depends on the function's second argument. This argument is optional and takes one or a list of the following keys.
| Short.Key | Description |
|---|---|
| detection | Signal Detection |
| numProbes | Number of Probes |
| intensity | Signal Intensity |
| channel | Color Channel |
| dyeBias | Dye Bias |
| betas | Beta Value |
For example, "intensity" generates quality metrics related to signal intensity. When the second argument is not given, all stats are calculated. We consider signal detection the most important QC metric. The actual stat values can be retrieved from a sesameQC object using sesameQC_getStats():
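As a minimal sketch, the detection success fraction (the `frac_dt` name that appears in parentheses in the printed report) could be retrieved as follows. The example dataset name is an assumption and must already be available in your sesameData cache.

```r
library(sesame)

# Assumption: this sesameData example (a list of SigDF objects) is cached locally.
sdfs <- sesameDataGet("EPIC.5.SigDF.normal")

# Compute only the signal detection statistics for the first sample.
qc <- sesameQC_calcStats(sdfs[[1]], "detection")

# Extract a single statistic by its short name; the result is a plain numeric value.
frac_dt <- sesameQC_getStats(qc, "frac_dt")
frac_dt
```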
## [1] 0.9561896
Multiple sesameQC objects can also be combined into a single data frame.
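One way to do this, sketched under the assumption that the example dataset below is cached locally, is to convert each sesameQC object with `as.data.frame()` and stack the resulting rows:

```r
library(sesame)

# Assumption: this sesameData example (a list of SigDF objects) is cached locally.
# Take two samples to keep the example small.
sdfs <- sesameDataGet("EPIC.5.SigDF.normal")[1:2]

# One sesameQC object per sample, restricted to detection statistics.
qcs <- lapply(sdfs, sesameQC_calcStats, "detection")

# Each sesameQC object becomes a data frame; rbind stacks them into one table.
qc_df <- do.call(rbind, lapply(qcs, as.data.frame))
head(qc_df)
```

The resulting table has one row per sample and one column per statistic, which makes it convenient for filtering samples on a threshold such as `frac_dt`.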
SeSAMe can compare your sample with public datasets. The sesameQC_rankStats() function ranks the input sesameQC object against sesameQC objects calculated from public datasets. It reports the rank percentage of the input sample as well as the number of datasets compared (N).
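A minimal sketch of this workflow is given here, assuming the example dataset `EPIC.1.SigDF` is cached locally and that the public reference statistics for the EPIC platform can be retrieved. The first printout that follows shows the plain statistics and the second shows the same statistics annotated with ranks.

```r
library(sesame)

# Assumption: this sesameData example (a single SigDF) is cached locally.
sdf <- sesameDataGet("EPIC.1.SigDF")

# Compute the signal intensity statistics and print them.
qc <- sesameQC_calcStats(sdf, "intensity")
qc

# Rank the same statistics against public datasets of the same platform.
qc_ranked <- sesameQC_rankStats(qc, platform = "EPIC")
qc_ranked
```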
##
## =====================
## | Signal Intensity
## =====================
## Mean sig. intensity : 3171.21 (mean_intensity)
## Mean sig. intensity (M+U) : 6342.41 (mean_intensity_MU)
## Mean sig. intensity (Inf.II) : 2991.85 (mean_ii)
## Mean sig. intens.(I.Grn IB) : 3004.33 (mean_inb_grn)
## Mean sig. intens.(I.Red IB) : 4670.97 (mean_inb_red)
## Mean sig. intens.(I.Grn OOB) : 318.55 (mean_oob_grn)
## Mean sig. intens.(I.Red OOB) : 606.99 (mean_oob_red)
## N. NA in M (all probes) : 0 (na_intensity_M)
## N. NA in U (all probes) : 0 (na_intensity_U)
## N. NA in raw intensity (IG) : 0 (na_intensity_ig)
## N. NA in raw intensity (IR) : 0 (na_intensity_ir)
## N. NA in raw intensity (II) : 0 (na_intensity_ii)
##
## =====================
## | Signal Intensity
## =====================
## Mean sig. intensity : 3171.21 (mean_intensity) - Rank 15.7% (N=636)
## Mean sig. intensity (M+U) : 6342.41 (mean_intensity_MU)
## Mean sig. intensity (Inf.II) : 2991.85 (mean_ii) - Rank 15.6% (N=636)
## Mean sig. intens.(I.Grn IB) : 3004.33 (mean_inb_grn) - Rank 7.5% (N=636)
## Mean sig. intens.(I.Red IB) : 4670.97 (mean_inb_red) - Rank 21.2% (N=636)
## Mean sig. intens.(I.Grn OOB) : 318.55 (mean_oob_grn) - Rank 4.2% (N=636)
## Mean sig. intens.(I.Red OOB) : 606.99 (mean_oob_red) - Rank 3.6% (N=636)
## N. NA in M (all probes) : 0 (na_intensity_M)
## N. NA in U (all probes) : 0 (na_intensity_U)
## N. NA in raw intensity (IG) : 0 (na_intensity_ig)
## N. NA in raw intensity (IR) : 0 (na_intensity_ir)
## N. NA in raw intensity (II) : 0 (na_intensity_ii)
SeSAMe provides functions to create QC plots. Some functions take sesameQC objects as input, while others plot SigDF objects directly. Here are some examples:
sesameQC_plotBar() takes a list of sesameQC objects and creates a bar plot for each metric calculated.
sesameQC_plotRedGrnQQ() graphs the dye bias between the two color channels.
sesameQC_plotIntensVsBetas() plots the relationship between β values and signal intensity and can be used to diagnose artificial readout and influence of signal background.
sesameQC_plotHeatSNPs() plots SNP probes and can be used to detect sample swaps.
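The four plotting functions above could be called as in this sketch. It assumes the example dataset below is cached locally and that a graphics device is available; the dataset choice is illustrative only.

```r
library(sesame)

# Assumption: this sesameData example (a list of SigDF objects) is cached locally.
sdfs <- sesameDataGet("EPIC.5.SigDF.normal")[1:2]
sdf <- sdfs[[1]]

# Bar plots from a list of sesameQC objects (here: detection statistics only).
qcs <- lapply(sdfs, sesameQC_calcStats, "detection")
sesameQC_plotBar(qcs)

# Dye bias between the red and green channels, drawn from a single SigDF.
sesameQC_plotRedGrnQQ(sdf)

# Beta values against signal intensity, drawn from a single SigDF.
sesameQC_plotIntensVsBetas(sdf)

# Heatmap of SNP probes across samples, drawn from a list of SigDFs.
sesameQC_plotHeatSNPs(sdfs)
```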
More about quality control plots can be found in the Supplemental Vignette.
## R Under development (unstable) (2021-11-09 r81170)
## Platform: x86_64-apple-darwin20.6.0 (64-bit)
## Running under: macOS Big Sur 11.6.2
##
## Matrix products: default
## BLAS: /Users/zhouw3/.Renv/versions/4.2.dev/lib/R/lib/libRblas.dylib
## LAPACK: /Users/zhouw3/.Renv/versions/4.2.dev/lib/R/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] knitr_1.37 sesame_1.13.38 sesameData_1.13.33
## [4] ExperimentHub_2.3.5 AnnotationHub_3.3.9 BiocFileCache_2.3.4
## [7] dbplyr_2.1.1 BiocGenerics_0.41.2 rmarkdown_2.11
## [10] R6_2.5.1
##
## loaded via a namespace (and not attached):
## [1] bitops_1.0-7 matrixStats_0.61.0
## [3] bit64_4.0.5 filelock_1.0.2
## [5] RColorBrewer_1.1-2 httr_1.4.2
## [7] GenomeInfoDb_1.31.6 tools_4.2.0
## [9] bslib_0.3.1 utf8_1.2.2
## [11] DBI_1.1.2 colorspace_2.0-3
## [13] withr_2.5.0 tidyselect_1.1.2
## [15] preprocessCore_1.57.0 bit_4.0.4
## [17] curl_4.3.2 compiler_4.2.0
## [19] cli_3.2.0 Biobase_2.55.0
## [21] DelayedArray_0.21.2 sass_0.4.1
## [23] scales_1.1.1 readr_2.1.2
## [25] rappdirs_0.3.3 stringr_1.4.0
## [27] digest_0.6.29 XVector_0.35.0
## [29] pkgconfig_2.0.3 htmltools_0.5.2
## [31] MatrixGenerics_1.7.0 highr_0.9
## [33] fastmap_1.1.0 rlang_1.0.2
## [35] RSQLite_2.2.11 shiny_1.7.1
## [37] jquerylib_0.1.4 generics_0.1.2
## [39] jsonlite_1.8.0 wheatmap_0.2.0
## [41] BiocParallel_1.29.18 dplyr_1.0.8
## [43] RCurl_1.98-1.6 magrittr_2.0.2
## [45] GenomeInfoDbData_1.2.7 Matrix_1.4-0
## [47] Rcpp_1.0.8.3 munsell_0.5.0
## [49] S4Vectors_0.33.11 fansi_1.0.3
## [51] lifecycle_1.0.1 stringi_1.7.6
## [53] yaml_2.3.5 SummarizedExperiment_1.25.3
## [55] zlibbioc_1.41.0 plyr_1.8.7
## [57] grid_4.2.0 blob_1.2.2
## [59] parallel_4.2.0 promises_1.2.0.1
## [61] crayon_1.5.1 lattice_0.20-45
## [63] Biostrings_2.63.2 hms_1.1.1
## [65] KEGGREST_1.35.0 pillar_1.7.0
## [67] GenomicRanges_1.47.6 reshape2_1.4.4
## [69] stats4_4.2.0 glue_1.6.2
## [71] BiocVersion_3.15.0 evaluate_0.15
## [73] BiocManager_1.30.16 png_0.1-7
## [75] vctrs_0.3.8 tzdb_0.2.0
## [77] httpuv_1.6.5 gtable_0.3.0
## [79] purrr_0.3.4 assertthat_0.2.1
## [81] cachem_1.0.6 ggplot2_3.3.5
## [83] xfun_0.29 mime_0.12
## [85] xtable_1.8-4 later_1.3.0
## [87] tibble_3.1.6 AnnotationDbi_1.57.1
## [89] memoise_2.0.1 IRanges_2.29.1
## [91] ellipsis_0.3.2 interactiveDisplayBase_1.33.0
## [93] BiocStyle_2.23.1